Predicting Rare Events In Temporal Domains

Authors

  • Ricardo Vilalta
  • Sheng Ma
Abstract

Temporal data mining aims at finding patterns in historical data. Our work proposes an approach to extract temporal patterns from data to predict the occurrence of target events, such as computer attacks on host networks or fraudulent transactions in financial institutions. Our problem formulation poses two major challenges: 1) we assume events are characterized by categorical features and display uneven inter-arrival times, an assumption that falls outside the scope of classical time-series analysis; 2) we assume target events are highly infrequent, so predictive techniques must deal with the class-imbalance problem. We propose an efficient algorithm that tackles both challenges by transforming the event prediction problem into a search for all frequent eventsets preceding target events. The class-imbalance problem is overcome by searching for patterns on the minority class exclusively; the discriminative power of the patterns is then validated against the other classes. Patterns are then combined into a rule-based model for prediction. Our experimental analysis indicates the types of event sequences in which target events can be accurately predicted.
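The pipeline outlined in the abstract (collect the eventsets that precede target events, keep the frequent ones, validate them against windows that do not precede a target, then combine the survivors into rules) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function names, the fixed-length time window, the toy event log, the thresholds, and the brute-force enumeration of eventsets (in place of an efficient frequent-itemset search) are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

def window_at(events, end, window, target):
    """Set of non-target event types observed in the interval [end - window, end)."""
    return frozenset(k for (t, k) in events if end - window <= t < end and k != target)

def frequent_eventsets(windows, min_support, max_size=2):
    """Brute-force count of eventsets (up to max_size) over the given windows."""
    counts = Counter()
    for w in windows:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(w), size):
                counts[frozenset(combo)] += 1
    n = len(windows)
    return {s: c / n for s, c in counts.items() if c / n >= min_support}

def confidence(eventset, positive, negative):
    """Fraction of the windows containing eventset that precede a target event."""
    pos = sum(eventset <= w for w in positive)
    neg = sum(eventset <= w for w in negative)
    return pos / (pos + neg) if pos + neg else 0.0

def predict(window_types, rules):
    """Rule-based model: fire if any validated eventset occurs in the window."""
    return any(rule <= window_types for rule in rules)

# Hypothetical event log: (timestamp, categorical event type); "T" is the rare target.
events = [(1, "a"), (2, "b"), (3, "T"), (10, "a"), (11, "c"), (12, "T"),
          (20, "b"), (21, "c"), (30, "a"), (31, "b"), (32, "T")]
TARGET, WINDOW = "T", 5

# Step 1: search for patterns on the minority class only (windows before targets).
positive = [window_at(events, t, WINDOW, TARGET) for (t, k) in events if k == TARGET]
frequent = frequent_eventsets(positive, min_support=0.6)

# Step 2: validate against windows that do not precede a target
# (the end times 8 and 22 are picked by hand for this toy example).
negative = [window_at(events, end, WINDOW, TARGET) for end in (8, 22)]
rules = {s for s in frequent if confidence(s, positive, negative) >= 0.9}

# Step 3: use the surviving eventsets as a rule-based predictor.
print(sorted(sorted(r) for r in rules))
print(predict(frozenset({"a", "c"}), rules))
```

Mining only the windows that precede target events keeps the search small even when targets are rare; the negative windows are consulted only in the validation step, which is how the sketch mirrors the abstract's treatment of class imbalance.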

Similar resources

Learning to Predict Extremely Rare Events

This paper describes Timeweaver, a genetic-based machine learning system that predicts events by identifying temporal and sequential patterns in data. This paper then focuses on the issues related to predicting rare events and discusses how Timeweaver addresses these issues. In particular, we describe how the genetic algorithm’s fitness function is tailored to handle the prediction of rare even...

Learning to Predict Rare Events in Event Sequences

Learning to predict rare events from sequences of events with categorical features is an important, real-world, problem that existing statistical and machine learning methods are not well suited to solve. This paper describes timeweaver, a genetic algorithm based machine learning system that predicts rare events by identifying predictive temporal and sequential patterns. Timeweaver is applied t...

Learning to Predict Rare Events in Categorical Time-Series Data

Learning to predict rare events from time-series data with non-numerical features is an important real-world problem. An example of such a problem is the task of predicting telecommunication equipment failures from network alarm data. For a variety of reasons, existing statistical and machine learning methods are not well suited to solving this class of problems. This paper describes timeweaver...

Simulation of rainfall temporal distribution pattern using WRF Model (case study of Parsian dam basin)

During the rainfall, the intensity of precipitation varies. Changes in the amount of precipitation during an event of rainfall are effective in the resulting of flood and its intensity. Knowledge of how rainfall changes over time during rainfall is determined by temporal distribution pattern of rainfall. For this purpose, availability of short-term time scales rainfalls data are important that ...

An Integrated framework for Mining Temporal Logs from Fluctuating Events

The importance of mining time lags of hidden temporal dependencies from sequential data is highlighted in many domains including system management, stock market analysis, climate monitoring, and more. Mining time lags of temporal dependencies provides useful insights into the understanding of sequential data and predicting its evolving trend. Traditional methods mainly utilize the predefined ti...

Publication date: 2002